CONTENTS


Report Number     Title                                    Author


 1.  Storage and Retrieval of Contents of
     Technical Literature, May 1956 ............ Don D. Andrews et al.

 2.  Advances in Mechanization of Patent
     Searching, April 1956 ..................... [author illegible]

 3.  Problems in Mechanizing the Search in
     Examining Patent Applications,
     October 1956 .............................. Simon M. Newman

 4.  Storage and Retrieval of Contents of
     Technical Literature, June 1957 ........... Simon M. Newman

 5.  Organization of Chemical Disclosures
     for Mechanized Retrieval, June 1957 ....... B. E. Lanham et al.

 6.  [title illegible] (ILAS), June 1957 ....... [author illegible]

 7.  A Punched Card System for Searching
     Steroid Compounds, July 1957 .............. J. Frome et al.

 8.  Recent Advances in Patent Office
     Searching; Steroid Compounds .............. J. Frome et al.
     and ILAS, 1957 ............................ Don D. Andrews

 9.  Linguistic Problems in Mechanization
     of Patent Searching, 1957 ................. Simon M. Newman

10.  A First Approach to Patent Searching
     Procedures on SEAC, January 1958 .......... H. Pfeffer et al.

11.  A Manual for Coding Steroids,
     November 1958 ............................. [author illegible]

12.  Storage and Retrieval of Contents of
     Technical Literature, November 1958 ....... Simon M. Newman

13.  A System of Retrieval Compounds,
     Compositions, Processes and
     Polymers, November 1958 ................... J. Frome et al.

14.  Variable Scope Patent Searching by an
     Inverted File Technique, Nov. 1958 ........ J. Leibowitz et al.

15.  A Notation System for Transliterating
     Technical and Scientific Texts for
     Use in Data Processing Systems,
     May 1959 .................................. [author illegible]

16.  Analysis of Prepositionals for
     Interrelated Concepts, July 1959 .......... Simon M. Newman

17.  Semi-automatic Indexing and Encoding,
     December 1959 ............................. [author illegible]

18.  Mechanized Searching of Phosphorus
     Compounds, January 1961 ................... J. Frome

19.  Revised Steroid Search Systems Coding
     Manuals, January 1961 ..................... [author illegible]

20.  Parameters for an Information Retrieval
     System for Chemical Processes,
     April 1961 ................................ [author illegible]

21.  A New Sequential Enumeration and Line
     Formula Notation System for Organic
     Compounds, November 1961 .................. H. W. Hayward

22.  Manual for a Punched Card Retrieval
     System for Organic Phosphorus
     Compounds, November 1961 .................. J. Frome et al.

Patent Office 


Research and Development 


Reports


STORAGE AND RETRIEVAL OF 
CONTENTS OF TECHNICAL 
LITERATURE 

NONCHEMICAL INFORMATION 


Preliminary Report
May 15, 1956


FOR OFFICIAL DISTRIBUTION


Prepared by 


Don D. Andrews 
Director 


Simon M. Newman 
Patent Research Specialist 
Staff Member 


Office of Research and Development 
U. S. Patent Office 


Robert C. Watson 
Commissioner of Patents 


Sinclair Weeks 
Secretary of Commerce 


OFFICE OF RESEARCH AND DEVELOPMENT REPORTS


Errata


Storage and Retrieval of Contents of Technical Literature
Non-Chemical Information
Preliminary Report

p 3, col 1, l. 37    for "clips", read -chips-
            l. 44    for "principal", read -principle-
p 5, col 1, l. 35    for "empackaged", read -enpackaged-
p 6, schedule 2, meaning 4, under Examples of Use, last example,
                     insert -thru- after "book"
p 7, schedule 6, under Explanation, ll. 1 & 2, ll. 3 & 4 and ll.
                     5 & 6 should be interchanged.
     schedule 7, under Concepts and Interfixes ll. 2 & 5 should
                     be interchanged.
p 9, col 1, l. [line number illegible]
                     for "except" read -concept-
p 11, schedule 15, the title Descriptor and Modulant should be
                     supplied for col. 2.
      [location illegible]
                     after "item" insert -below as-
      schedule 17, under the title Concept and Interfix, the
                     numeral -4- should be inserted in ll. 2, 4
                     and 6 immediately above the "4" in ll. 7, 9,
                     10 & 12.


Problems in Mechanizing the Search in Examining Patent Applications

[location illegible]
                     for "sytem" read -system-
p 15, col 1          between ll. 7 & 4, insert -which might have any
                     of the details for-
p 21                 the title, "Exhibit 9" at the bottom of the
                     page should appear directly below the
                     illustration at the top of the page.


STORAGE AND RETRIEVAL OF CONTENTS OF TECHNICAL LITERATURE 
NONCHEMICAL INFORMATION 


INTRODUCTION 


This is the first of a series of papers reporting 
the results of research directed to the storing and 
retrieval of scientific information disclosed in the 
contents of non-chemical patents and other technical 
documents. This paper constitutes a preliminary
report of the work done to date, and of our thinking
at this time. Changes and developments will be 
reported in future papers. 

This research is a second step by the Office of 
Research and Development, (U. S. Patent Office) 
in its study of methods of storing and retrieving 
information in scientific disclosures. A chemical 
task-force has already reported (1) on the develop-
ment of a proposed system for handling chemical
disclosures. All of the research in this field done
to date by the chemical task-force has been care-
fully studied and used by us, and credit for many of
the ideas here propounded is freely given them. 

In any system which may result from this re- 
search, there are at least two features which will 
be desirable. First the system, in so far as pos- 
sible, should be compatible with the system under 
development by the Chemical task-force which will 
be used to search the Chemical Patents and Lit- 
erature. And secondly, it should be able to encode 
for retrieval, any disclosed feature of the document 
being encoded. Policy will later dictate what fea- 
tures, if any, should be omitted in the encoding 
process. 

However, in determining what portions should 
be omitted when encoding a document, it must be 
kept in mind that many processes disclose a by- 
product which may not seem important or even 
relevant at the time of coding. A very specific 
machining operation upon a metal blank to form a 
particular machine part may incidentally leave a 
pile of intertangled metal chips, which are dis-
closed as mere scrap. However, this disclosure
may be the very reference which will be later
wanted in a search for the manufacture of steel
wool, and failure to encode the scrap as one prod-
uct of the machining operation, if the other prod-
uct is encoded, can be seen to be an error in
principle.

Since most disclosures of scientific information 
are already linguistically expressed, and it seems 
clear that others in the form of drawings, tables, 
photographs, models, working machinery, etc. are 
translatable into language form, it was felt that a 
purely linguistic approach to this problem war- 
ranted investigation. As this particular study 
progressed we slowly came to the conclusion that 
we could not use either the word position in a
sentence, or the grammatical construction of the 
sentence in the solution of this problem. 


"LANGUAGE" TO BE USED 


Professor Stuart C. Dodd, in his paper “Model 
English”(2) has suggested the development of a 
“Ruly” English (as opposed to the common “un- 
ruly” language which we now have) in which every 
word would have one, and only one conceptual mean- 
ing, and in which each and every concept would 
have only a single word to describe it. Our study 
contemplates, at least in part, the creation and 
use of such a Ruly Language—however a special- 
ized one, wherein the “words” will be designed and 
adapted for information storage and retrieval, and 
not necessarily styled for conversation or writing. 

It is unfortunate that in the English language 
there has been no uniform or logical rule for the 
naming of devices or things. A few things are 
named for their shape, for example, a block or a 
ring. Others are named for the material from 
which they are made, for example, a glass. The 
great bulk of things which we refer to are given 
functional names because of the process they per- 
form, for example, a press or a hammer; or for 
the use to which they are put, for example, a re- 
ceptacle or a cover. Others are named for the lo- 
cation from which they first came, for example, 
china. Some few have arbitrary names, for exam- 
ple, a pencil or an ax. Thus, except for those 
named for their shape (which constitute the only 
words truly descriptive of the static structure of a 
thing) and those named arbitrarily, we see that 
the names, themselves, are in reality, either 
broad relationships with other things which are 
not recited or broad statements of processes of 
use of the thing. 

In the absence of a Ruly English, we may have 
to use several words joined together by hyphens(-) 
to simulate a Ruly English word in examples of our
coding system. 


A SIMPLIFIED EXAMPLE OF ENCODING 


An analysis of the data needed to be encoded for 
retrieval indicates that such data is reducible to 
the recitation of either (1) named things alone, or 
with descriptive explanations thereof, or (2) two or
more of these “named things” with a stated re- 
lationship of each to one or more of the others. 

Let us consider a very simple disclosure, viz: 
A table standing on the floor, and an ash tray and 
book spaced alongside each other on the table. The
table constitutes a support, it has plural legs, its 
top is flat, and it supports a book and an ash tray. 
The book has a leather cover, it is colored blue; 
the ash tray is made of metal, is used as a con- 
tainer and lies adjacent to the book. 




Each of the terms enumerated in this disclosure 
which recites a characteristic of one of the dis- 
closed things may be translated into a Ruly word. 
Upon analysis, it will be seen that each such word 
describes the thing from one aspect. 

We will accordingly group the words which 
characterize each thing. Additionally, where a 
specific interrelation between two or more things 
is stated, we will place into each group, one of a 
set of cognate words, which words together will ex- 
press this interrelation. 

These cognate words will be “mirror-images” of 
the explicit interrelationship expressed between the 
things described by the words of the two groups and 
will act to couple the groups by this relationship. 
E.g., if A supports B we will put “supporting” with
A and “supported-by” with B. If P is greater than 
Q, we will put “greater-than” with P and “lesser- 
than” with Q. And if M is equal to N, we will put 
“equal-to” with both M and N. 

We will also place an arbitrary but identical num- 
ber with each cognate word to indicate which words 
constitute the interrelated couple between the 
groups. A second set of cognate words describing 
a second and different relationship, i.e., a dif-
ferent couple, will therefore carry different, ar-
bitrary, identical numbers.
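
The placing of mirror-image cognate words, each carrying the same arbitrary number, can be sketched in Python. This is a minimal illustration under stated assumptions: the `COGNATES` table holds only the three pairs given as examples above, and the names `distribute`, `items` and `numbers` are hypothetical, chosen for this sketch.

```python
from itertools import count

# Mirror-image cognate pairs taken from the examples in the text.
COGNATES = {
    "supporting": "supported-by",
    "greater-than": "lesser-than",
    "equal-to": "equal-to",
}

def distribute(items, relation, subject, obj, numbers):
    """Place a cognate pair, tagged with one shared arbitrary number, into two groups."""
    n = next(numbers)
    items.setdefault(subject, []).append((relation, n))
    items.setdefault(obj, []).append((COGNATES[relation], n))
    return n

items = {}
numbers = count(1)  # any source of distinct numbers would do
distribute(items, "supporting", "A", "B", numbers)
distribute(items, "greater-than", "P", "Q", numbers)
distribute(items, "equal-to", "M", "N", numbers)

for thing, words in items.items():
    print(thing, words)
```

The numbers carry no meaning of their own; they serve only to show which two words belong to the same couple.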

For convenience of expression, each individual
word will be called a “descriptor” and the group
of descriptors relating to one named thing will be
called an “item.” Each of the items will be num-
bered consecutively, only for ease in referring to
them, since it will be later shown that their order is
not material. The process of showing interrela-
tional concepts of two things by placing descriptors
of their interrelation in each of the items so re-
lated will be called “distribution.” The arbitrary
number indicating the parts of a distributed con-
cept to be coupled will be called an “interfix.”
Since the numerical value of the interfix merely
shows identity of the distributed concepts, the re-
trieval of such distributed concepts can only be
made on requests for identity of the interfix. This
type of retrieval will be called “blind retrieval.”

The disclosure recited above might be illustrated
as shown in Schedule 1.


Item #   Descriptors        Interfix

100      Table
         Support
         Multi-legged
         Flat-top
         Supported-by       1
         Supporting         2

101      Floor
         Supporting         1

102      Book
         Leather-binding
         Blue-colored
         Supported-by       2
         Adjacent-to        3

103      Ash-tray
         Container
         Metallic
         Supported-by       2
         Adjacent-to        3

SCHEDULE 1

In the preceding example, we see four items, 
100 listing the descriptors of the table, 101 the 
floor, 102 the book and 103 the ash tray. By the 
use of the separate descriptors in each item, we 
have described each thing as a whole by each of 
its various disclosed descriptive characteristics. 

By the use of the interfix #1 we have linked to- 
gether the descriptor “supported-by” relating to the 
table (100) and the descriptor “supporting” of the 
floor (101), designating the supporting of the table 
on the floor. In the same manner the interfix #2 
designates the book and ash tray supported by the 
table and interfix #3 designates that the book and 
ash tray are each “adjacent-to” the other. 

It can thus be seen that the things itemized may be 
retrieved either generically or in detail, and in 
either combination or subcombination, with or with- 
out their interrelations. Also, and most important- 
ly from the Patent Office point of view, the in- 
terrelationships there recited may be searched for 
and retrieved, independently of the details of the 
specific things recited. For example, not only can 
we retrieve a metallic ash tray supported upon a 
multi-legged table, but we can retrieve a container 
upon a support or a blue thing and a metal thing 
adjacent to each other, whether or not they have a 
common support. 
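
The two generic retrievals just mentioned can be sketched in Python against the items of Schedule 1. This is a minimal illustration, not a proposed machine procedure: the data layout (a set of plain descriptors plus a set of descriptor and interfix pairs for each item) and the function name `find_pairs` are assumptions made for the sketch.

```python
# Schedule 1 as data: item number -> (plain descriptors, {(cognate word, interfix)}).
ITEMS = {
    100: ({"table", "support", "multi-legged", "flat-top"},
          {("supported-by", 1), ("supporting", 2)}),
    101: ({"floor"}, {("supporting", 1)}),
    102: ({"book", "leather-binding", "blue-colored"},
          {("supported-by", 2), ("adjacent-to", 3)}),
    103: ({"ash-tray", "container", "metallic"},
          {("supported-by", 2), ("adjacent-to", 3)}),
}

def find_pairs(items, want_a, rel_a, want_b, rel_b):
    """Return (a, b) item pairs where a has the descriptors want_a, b has want_b,
    and some interfix couples rel_a in item a with rel_b in item b."""
    hits = []
    for a, (desc_a, links_a) in items.items():
        for b, (desc_b, links_b) in items.items():
            if a == b or not (want_a <= desc_a and want_b <= desc_b):
                continue
            fixes_a = {n for r, n in links_a if r == rel_a}
            fixes_b = {n for r, n in links_b if r == rel_b}
            if fixes_a & fixes_b:  # blind retrieval: only identity of the number matters
                hits.append((a, b))
    return hits

# "a container upon a support"
print(find_pairs(ITEMS, {"container"}, "supported-by", {"support"}, "supporting"))
# "a blue thing and a metal thing adjacent to each other"
print(find_pairs(ITEMS, {"blue-colored"}, "adjacent-to", {"metallic"}, "adjacent-to"))
```

Neither request names a table, a book or an ash tray, yet each reaches the corresponding items through the generic descriptors and the shared interfix.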

This system may also be utilized in process or 
method disclosures, an example of which will be 
given later (Schedule 9). 

Some refinements which have been considered 
will now be discussed. None of these can be said 
to have been completely debugged. 


USE OF WORD ROOTS WITH MODULANTS 


We have previously pointed out that the names of 
things often state various relationships. Upon anal-
ysis, both names and other words used as de-
scriptors usually imply either a broad relationship
with some other unidentified thing, or an indefinite
or undefined relationship with the specific thing
being itemized. E.g., in Schedule 1 the word “con-
tainer” in item 103 indicates that the thing (ash-
tray) itemized in 103 is a “container” or “holder”
for something not itemized, while the “leather- 
binding” of item 102 indicates that the thing (book) 
itemized in 102 has previously been thru some in- 
definite and undefined process called “binding.” 

Let us postulate a Ruly word ENPACKAGE, de-
fined in unruly English as the process of surround-
ing some other thing with an enclosure, wrapper or
container to form a package. This concept has
been clearly, and we believe unambiguously, stated
elsewhere(3), and includes within its scope, the
process of enclosing a candy bar in a tin-foil wrap-
per, filling a preformed paste-board box with cereal
and closing the box, or enclosing loose tea in
permeable material to make a tea bag.

In addition to (1) “the process” enpackage, there 
come to mind additional corollary concepts such 
as— 

(2) The “enpackager” or package-maker, i.e., 
the performer of enpackage, known in the Patent 
Office as “the apparatus.” 

(3) The thing-made-by-enpackage, i.e., a pack- 
age, known in the Patent Office as “the final prod- 
uct” or more simply, “the product.” 

(4) The filling-(or contents-)-which-was-“en- 
packaged,” i.e., the candy bar, the cereal or the 
tea, known in the Patent Office as the “starting 
material,” or sometimes, in other types of proc- 
esses as “the stock material.” 

(5) The partially-completed-product-at-some- 
point-during-enpackage, i.e., the candy bar par- 
tially wrapped, or the full but still open cereal 
box, known in the Patent Office as “the inter- 
mediate product,” or sometimes, in other types of 
processes as “the blank.” 


(The product (3), the starting material (4) and
the intermediate product (5) are all part of a
genus known in the Patent Office as “the work,” in
other words, the-thing-worked-on-in-enpackage (or
by-the-“enpackager”).)

(6) The-condition-of-being-“enpackaged” as dis-
tinguished from being “unenpackaged” or loose.

(7) Another-thing-made-from-a-product, e.g., a
cup of tea made from a tea bag. (This thing, of
course, is a product of another process, the-
process-of-making-tea.)

(8) A-larger-thing-of-other-characteristics-of-
which-the-product-is-a-part, e.g., a merchandis-
ing display of a product. This would be known in
the Patent Office as a combination including the
product as a subcombination. Note that concept (7)
is a genus of which this concept (8) is a species.

We can now take the root of our ruly word en-
package, viz ENPACK-, and by means of a series
of inflecting codes, which we call “modulants,”
build the concepts of:

(1) Process
(2) Apparatus
    Work
        (3) Product
        (4) Starting-material
        (5) Intermediate-product
(6) Condition
(7) Made-from
        (8) Combination-including

where the species of a genus is shown by indenting
the species under its genus.


As pointed out later, we may want to include as a
descriptor, the common name of the thing being 
itemized. For want of a better means of identify- 
ing these non-ruly names of things, we shall use 
in the examples which follow a modulant called 
“named-thing.” 

Other modulants will easily come to mind. It 
seems clear that a series of modulant codes can
be devised with which descriptor word-roots can
be modulated to show the particular concept present
in a disclosure to be encoded. Such a code has
not yet been worked out in detail.

These modulants are inflectors of roots to allow 
them to serve as descriptors, and modulation is 
desirable because the root form may be retrieved 
without the modulant when making a generic search. 
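
The benefit of keeping root and modulant separable can be sketched in Python. This is a minimal illustration under stated assumptions: the document names and the small file of encoded descriptors are invented for the example, and the `Descriptor` type and `search` function are hypothetical names, since no modulant code has yet been worked out.

```python
from typing import NamedTuple

class Descriptor(NamedTuple):
    root: str       # unmodulated Ruly root, e.g. "ENPACK"
    modulant: str   # inflecting concept, e.g. "process" or "product"

# Hypothetical file of encoded documents, invented for illustration only.
FILE = {
    "doc-A": {Descriptor("ENPACK", "process")},
    "doc-B": {Descriptor("ENPACK", "product")},
    "doc-C": {Descriptor("ENPACK", "apparatus"), Descriptor("CUT", "process")},
}

def search(file, root, modulant=None):
    """Generic search on the root alone when modulant is None; specific search otherwise."""
    return sorted(
        doc for doc, descs in file.items()
        if any(d.root == root and modulant in (None, d.modulant) for d in descs)
    )

print(search(FILE, "ENPACK"))             # generic: every use of the root
print(search(FILE, "ENPACK", "product"))  # specific: packages only
```

A generic request for the root reaches the process, the apparatus and the product alike, while adding the modulant narrows the request to one concept.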


RULY ROOT CODING 


The coding of the unmodulated roots presents a
linguistic problem of no mean complexity. Crea-
tion of generic hierarchies of these roots which
will be meaningful is necessary. The term “steam”
as a process is specific to “evaporate,” as a prod-
uct it is specific to both “water” and “fluid,” while
as an apparatus it is generic to a “steam genera-
tor” (i.e., a boiler). In the beginning, we contem-
plate the necessity of utilizing a plurality of
hierarchical codes for each root, and as our coding
develops, we may be able to combine some of them
and reduce their number.


THE INTERRELATIONAL CONCEPT 


As pointed out before, relations between items 
are encoded by means of coupling descriptors with 
interfixes. In other words, in addition to modu- 
lating the root word to form a descriptor the in- 
terrelation must be shown by some additional in- 
flection of that word. These inflectors will be 
called “Interrelational Concepts.” They are often 
prepositional in form. 

Early in this effort, it was discovered that the 
meaning of any preposition was very elusive and 
ambiguous. 25 Basic English words which were 
recognized as prepositions (about, across, after, 
against, among, as, at, before, between, by, down, 
for, from, in, of, near, off, on, over, thru, till, 
to, under, up and with) were accordingly studied 
with an attempt to discover what basic concepts 
they portrayed. 

Each of these prepositions was then further 
analyzed for as many of its unambiguous meanings 
as could be found. Each meaning was given a
number which we call a “meaning number” and 
in lieu of definition, an equivalent expression and 
examples of its use were listed. (In this work, 
Mr. C. G. Smith of this Office, gave considerable 
time and advice). An example of the breakdown of 
the preposition “thru” follows: 


THRU

Meaning #  Equivalent Expression        Examples of Use

 1         Between the boundaries of    A hole extending thru a board
 2         Between and across the       The route of work thru a machine; An arrow
           boundaries of                extending thru an apple
 3         Everywhere in                An odor pervading thru a room; Poison thru
                                        the dead body
 4         Progressing from beginning   To pass thru a doorway; A bill passing thru
           to end of                    the legislature; To read a book thru from
                                        cover to cover
 5         On a route between           The flight of an arrow thru the air; A ray of
           portions of                  sunlight shining thru the trees
 6         Here and there in            Pollen floating thru the air; Raisins
                                        scattered thru the bread dough
 7         By the way of                To cure thru an operation; To change thru
                                        legislation
 8         Because of                   The water froze thru loss of heat; To err
                                        thru ignorance
 9         By use of                    A solution thru calculus; To speak thru an
                                        interpreter
10         Be finished with             At 6:00 P.M. he was thru with work
11         Something simultaneous       To hear thru the din; To see thru the fog;
           with and despite of          To see thru his deceit
12         During and to the end of     Busy all thru the year; To go thru life
                                        without ever knowing
13         By the mere existence of     To be related thru marriage

SCHEDULE 2


Pending the coining of a Ruly word we distinguish 
the different concepts of the same unruly word by 
writing after the word the meaning number which 
we have assigned, thus: He was thru(10) with 
his homework. The sketch was first done in(8) 
outline. 

These breakdowns of the 25 prepositions were 
then scanned for redundancy, and a series of Ruly 
Words were constructed for some of the concepts. 
Equivalent prepositional phrases were also noted 
as we ran across them, their separate meanings 
were numbered and equivalent expressions and ex- 
amples were listed. As an example of such a Ruly 
Word, we have the concept Howby, which has the 
meaning, mode of proximate cause, and has equiva- 
lent meanings “resulting from the mode” and “by 
the use of the mode.” The terms collected from 
each of the pertinent breakdowns are shown in 
Schedule 3. 


With this analysis, we believe we have tied down 
the meaning of “howby” to a single concept. 
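
The step from a numbered unruly word to its Ruly concept can be sketched in Python. This is a minimal illustration under stated assumptions: the table below lists the preposition and meaning-number pairs that Schedule 3 collects under Howby, the meanings marked "part" there are treated as belonging wholly to Howby (a simplification), and the names `HOWBY_SOURCES`, `TAGGED` and `ruly_concepts` are hypothetical.

```python
import re

# (unruly word, meaning number) pairs collected under HOWBY in Schedule 3.
# Meanings marked "part" in the schedule are included whole here (a simplification).
HOWBY_SOURCES = {
    ("as", 3), ("by", 14), ("by", 18), ("by", 21), ("from", 10), ("in", 8),
    ("of", 25), ("thru", 7), ("thru", 8), ("thru", 9), ("to", 43), ("with", 7),
}

# A word written with its meaning number after it, e.g. "thru(10)" or "in(8)".
TAGGED = re.compile(r"\b([A-Za-z-]+)\((\d+)\)")

def ruly_concepts(sentence):
    """Return (word, meaning number, Ruly word or None) for each numbered word."""
    found = []
    for word, num in TAGGED.findall(sentence):
        key = (word.lower(), int(num))
        found.append((word, int(num), "HOWBY" if key in HOWBY_SOURCES else None))
    return found

print(ruly_concepts("The sketch was first done in(8) outline."))
print(ruly_concepts("He was thru(10) with his homework."))
```

The first sentence resolves to Howby; the second carries a meaning of "thru" that belongs to some other concept and is left unresolved by this table.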

Next these Ruly concepts were organized into 
generic hierarchies, the most generic being placed 
farthest to the left, the Subgeneric indented there- 
under and the most specific farthest to the right. 


For example, the several categories of Cause follow:


CAUSE, the proximate cause of a Result

Ruly Word          Explanation

CAUSBY             Proximate cause
  HOWBY            Mode of causing
    MEANSBY        Means for causing

SCHEDULE 4


As stated previously interrelational concepts are
complementally inflected into mirror-images of the
Concept, when distributed. E.g., in the case of
cause, the thing causing an effect results in a com-
plementary result. E.g., “a cut from a knife” could
be itemized in unruly English as:


Item #   Descriptors        Interfix

201      Cut
         Resulting-from     7
202      Knife
         Caused-by          7

SCHEDULE 5


HOWBY

Term and Meaning #          Examples of Use

Prepositions

As(3)                       To limp as the result of a fall
By(14 part,* 18 & 21)       To take by force; To teach by example
From(10 part*)              To gain a polish from wear
In(8)                       To sketch in outline; To argue in a circle
Of(25 part*)                To go of one’s own will
Thru(7, 8 & 9 part*)        To cure thru an operation; To gain thru
                            legislation; A solution thru calculus;
                            To freeze thru loss of heat
To(43)                      To succumb to a force
With(7 part*)               To kill with kindness

Phrases

Because-of(2)               To acquire polish because of wear
Result-from(2)              A cure resulting from an operation
By-use-of(1)                A solution by use of calculus

*By building single concepts, we discovered that
many of the prepositional meanings we thought were
unambiguous still covered more than one meaning,
and we had to split the meaning numbers already as-
signed. Hence in by(14) we used only some of the
examples, and left the others for another concept.
We also noted occasions where an unambiguous mean-
ing had been given several meaning numbers and we
therefore combined by(18) and by(21) with part of
by(14) in making Howby.

SCHEDULE 3


CODING OF CONCEPTS 


In order to handle these concepts in “item” form, 
and at the same time to utilize the generic 
hierarchy form, a specific system of coding them 
was adopted, which we call “compliance coding.” 

These compliance codes are binary in form,
that is, the presence or absence of qualities is
noted by 1’s and 0’s in separate columns. These
codes are organized and arranged so that the most
specific concepts have the fewest 1’s in their code.
Subgeneric concepts add additional 1’s in other
bit columns, and the most generic code includes
a 1 in every column in which a subgenus or species
exists.


The coding of the Cause-Result concept, with
addition of Ruly words to enable one to easily refer
to the codes follows:


Ruly Word     Code             Explanation

CAUSBY        0 1 1 1          Result of
CAUSFROM      [illegible]      Caused by
HOWBY         [illegible]      Result of process
HOWFROM       [illegible]      Process causing
MEANSBY       [illegible]      Mechanical result of
MEANSFROM     [illegible]      Mechanism causing

SCHEDULE 6


Causby, it will be noted, has a 1 in each of the 
last three columns, and a 0 in the first column.
Any other word which has a 0 in the first column and
a 1 in one or more of the other three columns is a
species under the genus Causby. Hence Howby and
Meansby are both species of Causby, but by the
same rule, Meansby is also a species of the sub-
genus Howby. In a like manner we have the genus
Causfrom, the subgenus Howfrom and the species 
Meansfrom. 
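
The genus and species test just described can be sketched in Python. This is a minimal illustration under stated assumptions: only the Causby code (a 0 followed by three 1's) is taken from the text; the Howby and Meansby codes below are assumed values chosen merely to satisfy the stated rule, and the function name `is_species` is hypothetical.

```python
# Four-column compliance codes. CAUSBY's code is the one described in the text;
# the HOWBY and MEANSBY codes are assumptions made for this sketch.
CODES = {
    "CAUSBY":  (0, 1, 1, 1),
    "HOWBY":   (0, 0, 1, 1),   # assumed
    "MEANSBY": (0, 0, 0, 1),   # assumed
}

def is_species(species, genus, codes=CODES):
    """True if `species` falls under `genus`: same first column, and the 1's in
    its remaining columns are a non-empty proper subset of the genus's 1's."""
    s, g = codes[species], codes[genus]
    if s[0] != g[0]:
        return False
    ones_s = {i for i, bit in enumerate(s[1:]) if bit}
    ones_g = {i for i, bit in enumerate(g[1:]) if bit}
    return bool(ones_s) and ones_s < ones_g

for pair in [("HOWBY", "CAUSBY"), ("MEANSBY", "CAUSBY"),
             ("MEANSBY", "HOWBY"), ("CAUSBY", "HOWBY")]:
    print(pair, is_species(*pair))
```

Because the more specific code has fewer 1's, a request framed on a genus can accept every code whose 1's fall within its own, which is how a generic search would reach the species.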

Now we can take the example of Schedule 5 and 
using our ruly concepts, we have: 


Item #   Descriptors and Modulants     Concepts and Interfixes

201      Cut-(product)
         Cut-(process)                 Meansby 7
202      Knife-(named-thing)
         Cut-(apparatus)
         Cut-(process)                 Meansfrom 7

SCHEDULE 7


An example of a more complex compliance code 
is that shown in Schedule 8. 


In this hierarchy, it will be noted that neither 
Syncwith nor Timnear are grouped with another 
word. In these cases, the complementary concept 
terms are identical in both original and “mirror 
image” form. We also note that Timnear is generic
to both Timafor and Timaft, though neither of the 
latter are subgeneric to each other. 

Only a few hierarchies of interrelational con- 
cepts have been formed to date, and not all of 
them have been coded. A number of concepts have 
also been collected, for which no hierarchies have 
been formed. The great bulk of the concepts en- 
compassed by the 25 words so far analyzed have 
not been collected. We are experimenting with a 
more direct approach in forming these concepts. 


Ruly Word                Explanation

SYNCSTART, SYNCBEGIN     Unequal, simultaneous beginning
SYNCSTOP, SYNCEND        Unequal, simultaneous end
SYNCWITH                 Simultaneous
DURING, WHILE            Shorter during longer, once
RECURPER                 Shorter during longer, repetitive
AFORLAP, AFTLAP          Overlapping periods
TIMAFOR, TIMAFT          Before and after
TIMNEAR                  Sequential, no sequence expressed
                         (not illustratable)

[The Code and Time Diagram columns of this schedule are illegible in this copy.]

SCHEDULE 8


RELATION OF MODULANTS TO INTERRELATIONAL CONCEPTS


Since many, if not all, the modulants are used to
show relationships, it may well be that the modulant
codes, when created, will be closely related to the
interrelational concept codes. This relationship
has yet to be analysed.


MORE COMPLEX ENCODING - USING METHOD AS EXAMPLE


Now let us return to the itemization of a simple
procedural method, e.g.: Filling a glass measure
from a china pitcher and emptying the measure in-
to a larger metal container. In the schedule which
follows, the substitutes for modulants are enclosed
in parentheses and the prepositional concepts, used
instead of their codes, are in italic.


Item #   Unruly-Root and Modulant         Ruly Concept and Interfix

209      Pitcher-(named-thing)
         Contain-(apparatus)
         Lip-(combination-including)
         China-(made-from)
         Dispense-(apparatus)
         Dispense-(method)                fromout-1  syncwith-2  timafor-6

210      Measure-(apparatus)
         Contain-(apparatus)
         Glass-(made-from)
         Size                             lesser-5
         Dispense-(method)                into-1     syncwith-2  timafor-6
         Dispense-(method)                fromout-3  syncwith-4  timaft-6

211      Contain-(apparatus)
         Metal-(made-from)
         Size                             greater-5
         Dispense-(method)                into-3     syncwith-4  timaft-6

SCHEDULE 9


In this schedule, we find item 209 directed to a 
thing called a pitcher variously described as a con- 
tainer, a thing having a lip, a thing made of china, 
a thing called a dispenser, and interrelated in a 
dispensing process step between the pitcher and 
some other thing. Interfix 1 shows that this step 
is “out of” the pitcher and that it is “into” the 
measure of item 210. Interfix 2 shows that this 
dispensing-receiving step is simultaneous. Interfix 
6 shows that this dispensing-receiving step occurs 
before a second dispensing-receiving process step 
between the measure 210 and the container 211. 
Interfix 4 shows that this second step is also 
simultaneous. 

We note that the measure of item 210 is made of 
glass and by interfix 5, that it is smaller in size 
than the metal container of item 211. 


SAME CONCEPT DIFFERENTLY EXPRESSED 


Coding will obviously not all be done by one per- 
son, nor will the question for retrieval normally be 
framed by the person who did the encoding. An 
example of the efficacy of the “interrelational concept”
in this situation may be interesting. Consider the 
simple disclosure: “The water is emptied from 
the pitcher.” And suppose a search question is 
framed in the form: Find “The pitcher is emptied 
of its water.” 

Coding disclosure and question we have: 


Item #   Unruly Root & Modulant      Ruly Concept and Interfix

Disclosure
27       water-(work)
         empty-(process)             fromwhence-6
28       pitcher-(named-thing)
         empty-(process)             whencefrom-6

Question
1        pitcher-(named-thing)
         empty-(process)             whencefrom-1
2        water-(work)
         empty-(process)             fromwhence-1

SCHEDULE 10


and we note that the two sets of items are identical,
although the order is changed. Since the order of
items is not material, we see that the question will
retrieve the disclosure.
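
A minimal sketch of this comparison follows, in Python. It is not the retrieval procedure of the project, only an illustration under stated assumptions: the data layout and the function name `retrieves` are hypothetical, each (descriptor, concept) pair is assumed to occur at most once per item, and interfix numbers are treated as blind links, so the question's interfix 1 may stand for the disclosure's interfix 6 provided the substitution is consistent.

```python
from itertools import permutations

# Schedule 10 as data: item number -> rows of (descriptor, ruly concept, interfix).
disclosure = {
    27: [("water-(work)", None, None), ("empty-(process)", "fromwhence", 6)],
    28: [("pitcher-(named-thing)", None, None), ("empty-(process)", "whencefrom", 6)],
}
question = {
    1: [("pitcher-(named-thing)", None, None), ("empty-(process)", "whencefrom", 1)],
    2: [("water-(work)", None, None), ("empty-(process)", "fromwhence", 1)],
}

def retrieves(question, disclosure):
    """True if the question items match some disclosure items in any order,
    with question interfix numbers renamed consistently to disclosure ones."""
    q_items = list(question.values())
    d_items = list(disclosure.values())
    for chosen in permutations(d_items, len(q_items)):
        link_map = {}  # question interfix -> disclosure interfix
        ok = True
        for q_rows, d_rows in zip(q_items, chosen):
            d_index = {(desc, concept): link for desc, concept, link in d_rows}
            for desc, concept, link in q_rows:
                if (desc, concept) not in d_index:
                    ok = False
                    break
                if link is not None and link_map.setdefault(link, d_index[(desc, concept)]) != d_index[(desc, concept)]:
                    ok = False
                    break
            if not ok:
                break
        # Distinct question interfixes must map to distinct disclosure interfixes.
        if ok and len(set(link_map.values())) == len(link_map):
            return True
    return False

print(retrieves(question, disclosure))
```

Trying every ordering of the disclosure items is what makes the order of items immaterial in this sketch; a question with the two ruly concepts interchanged would not be retrieved.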

These two sets are identical because the two con- 
cepts are identical. This is preordained in view of 
our analysis of from(3), listed as: whence, e.g.: 
“He took a penny from his pocket” and of(14) as: 
out from, e.g.: “It was a wine of France.” These
were both selected as elements of the ruly word
whencefrom, along with on(18): out of, e.g.: “His
check was drawn on the bank,” and off(1): remove
from, e.g., “He cut the end off the stick.”


SERIAL NUMBERING 


The handling of complex structures, whether 
static or dynamic, presents further problems. A 
table has plural legs, each of which may need 
separate identification. A transmission similarly 
has plural gears. To take care of this situation, 
we propose a complex notation which we call Se- 
rial-Numbers. Like our interfix, this will involve 
a blind retrieval process. These numbers will be 
assigned so that any larger combination will have 
the same significant figures as each of the sub- 
combinations which belong to it. Referring to our 
first itemized disclosure, (SCHEDULE 1) we might 
assign serial numbers as follows: 


Entire-Disclosure            [digits illegible]
Table-and-Contents           [digits illegible]
Table-top-and-contents       1 1 1 0
Table, first-leg             [digits illegible]
Table, second-leg            [digits illegible]
Table, third-leg, etc.       [digits illegible]
Table-top                    [digits illegible]
Book                         [digits illegible]
Ash-tray                     [digits illegible]


SCHEDULE 11 


The use of this serial-number notation appears 
necessary, but the utilization of it from the stand- 
point of retrieval has not been studied in detail. 
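
One possible reading of the rule that a larger combination shares its significant figures with each of its subcombinations is a prefix test. The Python sketch below assumes that reading. The four-digit serial numbers in it are hypothetical values chosen for illustration, not the assignments of Schedule 11, and the function names are likewise assumptions.

```python
def significant(serial):
    """Drop trailing zeros, leaving the significant figures of a serial number."""
    digits = list(serial)
    while digits and digits[-1] == 0:
        digits.pop()
    return tuple(digits)

def belongs_to(part, whole):
    """True if `part` begins with the significant figures of `whole`."""
    sig = significant(whole)
    return tuple(part[:len(sig)]) == sig

# Hypothetical serial numbers, for illustration only.
entire_disclosure = (1, 0, 0, 0)
table_and_contents = (1, 1, 0, 0)
table_top_and_contents = (1, 1, 1, 0)
first_leg = (1, 1, 2, 0)
book = (1, 1, 1, 2)

print(belongs_to(book, table_top_and_contents))       # the book is on the table top
print(belongs_to(first_leg, table_top_and_contents))  # the leg is not
```

Under this assumption, a blind search for everything belonging to a combination reduces to comparing leading digits, with no knowledge of what the digits stand for.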


MODIFYING CONCEPTS IN GENERAL 


The modifying concepts of English, i.e., the ad- 
jectives, adverbs and prepositional phrases can 
apparently all be handled by our system. Adjec- 
tives fall in several classes, each of which re- 
quires a different technique in coding. 

First, there are the purely descriptive modifiers 
or qualifiers. These are recognizable because the 
sentence in which they are used may be modified 
by making the word a predicate adjective following 
the verb to be. E.g., “A cold press” is equivalent 
to “The press is cold.” These words will be entered 
as modulated descriptors. 

There are next those modifiers which show the
role of the modified noun, or something concerning
it. These are usually concepts involving another
thing and require a separate item with a distributed
concept. “A power path” expresses the concept
“a path for power”; similarly, “the Potomac Bridge”
expresses “a bridge over the Potomac,” and “a
telephone call” expresses “a call on (or by) a
telephone,” etc. As an example of the role situation,
we may itemize the expression, “a cracker box” by:


Item #   Unruly Root and Modulant    Ruly Concept and Interfix

428      Box-(named-thing)           Containing-4
429      Cracker-(named-thing)       Contained-in-4


SCHEDULE 12 


This method of handling qualifiers requires
unambiguous contexts for encoding. “A German Book”
must be encoded as either “A book in German,” “A
book from Germany,” or “A book about Germany,”
according to the context.

The combination-subcombination relationship, 
which is a modifier of this form, presents some 
problems if the Serial Number notation referred to 
above is adhered to. As pointed out in Schedule 10, 
other means has been proposed for handling the 
expression “table-top” when referring to the specific
top of a specific table. It has not yet been
determined in which manner such concepts will be 
encoded. 


QUANTIFIERS 


The quantifiers, i.e., adverbial words modifying 
adjective words, are indications of relative posi- 
tion on a scale. As such they are interrelational 
concepts, and can be handled in that way. See, 
e.g., Schedule 9 where the metal container is 
larger than the glass measure. But as has been 
pointed out before, some relations are expressed 
generally without being interrelated to another 
thing. We have spoken of an ash tray as a con- 
tainer, without interrelating it with the smoker’s 
waste material, and we have called the book blue, 
without interrelating it to a standard color chart. 
Hence such ambiguous statements as a “large” ash
tray or a “light”-blue book cannot be coded as
interrelational concepts. Whether they can be coded
as modulants has yet to be investigated.


INVARIABLE CODES 


Certain aspects of disclosure are normally used
in conjunction with another concept. These are the
aspects of temperature, weight, elapsed time,
volume, etc. It appears clear that a code for
encoding such measurable items could use numerical
values preceded or followed by a fixed code notation
which would mean, for example, time (in minutes),
volume (in cubic meters), temperatures (in degrees
Kelvin), etc. Such coding will be called “invariable
coding” since no modulation or genus-species
relationships occur in these codes.
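
As a rough illustration of such invariable coding, the Python sketch below prefixes a numerical value with a fixed code notation. The two-letter notations TM, VL and TP are invented for this example and are not taken from the project; only the units (minutes, cubic meters, degrees Kelvin) come from the text.

```python
# Hypothetical fixed code notations; each aspect has exactly one unit.
UNIT_CODES = {
    "time": ("TM", "minutes"),
    "volume": ("VL", "cubic meters"),
    "temperature": ("TP", "degrees Kelvin"),
}

def encode_invariable(aspect, value):
    """Return the fixed code notation followed by the numerical value."""
    code, _unit = UNIT_CODES[aspect]
    return f"{code}{value:g}"

print(encode_invariable("time", 90))
print(encode_invariable("temperature", 373.15))
```

Because the unit is fixed by the notation, two disclosures stating the same quantity receive the same code, and no modulant is needed.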


INDEX NUMBERS 


Since many things will undoubtedly be searched 
for by common name, we will compile an alphabeti- 
cal collection of common terms with index numbers 
for each term. Where a term has a different meaning
in different arts, two index numbers will be
given, for example: 


BRAKE
    Motion Snubber        --  1,000,000,071
    Sheet Metal Bender    --    350,896,253


SCHEDULE 13 


Such common names can then be listed as a de- 
scriptor in an item, and will thus allow retrieval of 
such things by their common name. 
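
The alphabetical collection might be pictured as a simple lookup table. The Python sketch below holds only the BRAKE entry of Schedule 13; the structure and the function name `index_numbers` are assumptions made for illustration.

```python
# Common name -> meaning in a particular art -> index number (from Schedule 13).
INDEX = {
    "BRAKE": {
        "Motion Snubber": 1_000_000_071,
        "Sheet Metal Bender": 350_896_253,
    },
}

def index_numbers(common_name):
    """Return every index number filed under a common name, one per art."""
    return sorted(INDEX.get(common_name.upper(), {}).values())

print(index_numbers("brake"))
```

A searcher who does not know which art is meant can enter all the returned numbers as alternative descriptors.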


SPECIAL RELATION CODING 


Certain special aspects, which can be handled 
by some or all of the details already referred to, 
can also be handled in other ways, which have cer- 
tain advantages and solve other problems. They 
have been exploited for the solution of these latter 
problems, and it is possible that future research 
may generalize on these techniques for other and 
different problems. 

Many of the interrelationships searched in the
Patent Office have a dominant-recessive character;
others are equi-relative. E.g., a tractor (dominant)
pulls a trailer (recessive), a table (dominant)
supports a book (recessive) against the pull of
gravity, but two facing houses are merely opposite
to one another (equi-relative). During these
relationships, there may be motion of the things
recited in the related items, which we shall call
“dynamic,” e.g., the tractor-trailer example; or the
things recited in the related items may be “static,”
e.g., both the book-table and the house-house examples.


By reserving the use of three binary columns, we
can encode these relationships. In the first position
we can put a 1 for static and a 0 for dynamic.
The next two bits together can indicate either the
dominant-recessive or the equirelative condition
thusly:


dominant:       1 0
recessive:      0 1
equirelative:   0 0


SCHEDULE 14 
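
The three columns can be sketched as a small Python function. The names `ROLE_BITS` and `encode_relation` are hypothetical; the bit values are those of Schedule 14, and the examples are the tractor, trailer, table, book and house mentioned above.

```python
# Role bits from Schedule 14.
ROLE_BITS = {"dominant": (1, 0), "recessive": (0, 1), "equirelative": (0, 0)}

def encode_relation(static, role):
    """Return the three binary columns: static/dynamic bit, then the two role bits."""
    return (1 if static else 0, *ROLE_BITS[role])

examples = {
    "tractor": encode_relation(static=False, role="dominant"),
    "trailer": encode_relation(static=False, role="recessive"),
    "table": encode_relation(static=True, role="dominant"),
    "book": encode_relation(static=True, role="recessive"),
    "house": encode_relation(static=True, role="equirelative"),
}

for thing, bits in examples.items():
    print(thing, bits)
```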




This technique solves a specific problem in a
chain sequence of conditions where, e.g., A drives
B, B drives C, C drives D. With this type of
disclosure, one might wish to retrieve C driven by A.
Itemizing with this technique:


                                Static-   Dominant-
                                Motion    Recessive
Item #                          Column    Columns     Interfix

503      A-(named-thing)
         Drive-(apparatus)      0         1 0         4
504      B-(named-thing)
         Drive-(apparatus)      0         1 1         4
505      C-(named-thing)
         Drive-(apparatus)      0         1 1         4
506      D-(named-thing)
         Drive-(apparatus)      0         0 1         4


SCHEDULE 15 


we note that A is dominant only, B is recessive as
to A but dominant as to C, etc., while D is recessive
only as to C. By wording a question: “Find A with
a dominant drive and C with a recessive drive,”
we can retrieve this portion of the disclosure.
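
The question “Find A with a dominant drive and C with a recessive drive” can be sketched as a bit test in Python. The layout and the function names are assumptions, not the project's procedure; the bit values are those of Schedule 15. An item satisfies a role when every bit set in the wanted code is also set in its own columns, so the code 1 1 satisfies both roles.

```python
DOMINANT, RECESSIVE = (1, 0), (0, 1)

# Schedule 15: static-motion bit, dominant-recessive bits, interfix.
schedule_15 = {
    "A": {"static": 0, "role": (1, 0), "interfix": 4},
    "B": {"static": 0, "role": (1, 1), "interfix": 4},
    "C": {"static": 0, "role": (1, 1), "interfix": 4},
    "D": {"static": 0, "role": (0, 1), "interfix": 4},
}

def has_role(item, wanted):
    """True if every bit set in `wanted` is also set in the item's role columns."""
    return all(have >= want for have, want in zip(item["role"], wanted))

def find_drive(disclosure, driver, driven):
    """True if `driver` is coded dominant and `driven` recessive under one interfix."""
    a, c = disclosure[driver], disclosure[driven]
    return (has_role(a, DOMINANT)
            and has_role(c, RECESSIVE)
            and a["interfix"] == c["interfix"])

print(find_drive(schedule_15, "A", "C"))
print(find_drive(schedule_15, "D", "A"))
```

In this sketch the bits do not record position along the chain, so a question naming C as driver and B as driven would also pass the test; whether that is acceptable depends on the search.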

This technique is also adaptable in other
situations. Consider the thing A which is taken from
B to C and from C to D and from D to E. We can
again use two adjacent columns with the code 1 0
as “from” for B and 0 1 as “to” for E, and the
intermediate stations C and D would use the code
1 1 showing that they received the item “from” the
item ahead and sent it “to” the next item, as shown
in Schedule 16.


This shorthand system cuts out the use of a
concept column. However, if time were of the essence,
it could be handled with the interfixes assigned to
the specific concepts, though each word involving
both “from” and “to” would have to be repeated,
as shown in Schedule 17.


         Descriptor and           From-To
Item #   Modulant                 Columns    Interfix

227      A-(work)
         Transport-(method)                  4
228      B-(named-thing)
         Transport-(method)       1 0        4
229      C-(named-thing)
         Transport-(method)       1 1        4
230      D-(named-thing)
         Transport-(method)       1 1        4
231      E-(named-thing)
         Transport-(method)       0 1        4


SCHEDULE 16 


         Descriptor and                                      From-To
Item #   Modulant               Concept and Interfix         Columns

227      A-(work)
         Transport-(method)
228      B-(named-thing)
         Transport-(method)     Timafor-5             4      1 0
229      C-(named-thing)
         Transport-(method)     Timafor-5             4      0 1
         Transport-(method)     Timaft-5  Timafor-6   4      1 0
230      D-(named-thing)
         Transport-(method)     Timaft-5  Timafor-6   4      0 1
         Transport-(method)     Timaft-6              4      1 0
231      E-(named-thing)
         Transport-(method)     Timaft-6              4      0 1


SCHEDULE 17 
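
To see how the time interfixes of Schedule 17 recover the order of the stations, consider the following Python sketch. It is an illustration under assumptions: the row layout and the function names `legs` and `route` are hypothetical, a “from” row is paired with the “to” row carrying the same time concepts, and a leg marked timafor-n is taken to precede the leg marked timaft-n.

```python
# Schedule 17 rows: (item, time concepts with interfix, from-to code).
rows = [
    ("B", frozenset({("timafor", 5)}), (1, 0)),
    ("C", frozenset({("timafor", 5)}), (0, 1)),
    ("C", frozenset({("timaft", 5), ("timafor", 6)}), (1, 0)),
    ("D", frozenset({("timaft", 5), ("timafor", 6)}), (0, 1)),
    ("D", frozenset({("timaft", 6)}), (1, 0)),
    ("E", frozenset({("timaft", 6)}), (0, 1)),
]

def legs(rows):
    """Pair each 'from' row (1 0) with the 'to' row (0 1) sharing its time concepts."""
    senders = {times: item for item, times, code in rows if code == (1, 0)}
    receivers = {times: item for item, times, code in rows if code == (0, 1)}
    return {times: (senders[times], receivers[times]) for times in senders}

def route(rows):
    """Follow the legs in time order and return the stations visited."""
    by_times = legs(rows)
    # The first leg is the only one that comes after nothing.
    current = next(t for t in by_times if not any(c == "timaft" for c, _ in t))
    stops = [by_times[current][0]]
    while current is not None:
        stops.append(by_times[current][1])
        before = [n for c, n in current if c == "timafor"]
        current = next((t for t in by_times
                        if before and ("timaft", before[0]) in t), None)
    return stops

print(route(rows))
```

The from-to columns alone (Schedule 16) cannot distinguish C from D, since both carry the code 1 1; it is the repeated rows with time interfixes that fix the sequence.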




CONCLUSION 


This constitutes the status of this project to date.
Since the interrelational concepts now seem to be
the easiest to derive, additional concepts and
codes are now being worked on. Many other loose
ends are clearly evident and need tying up. Much
manpower and time are needed in deep research
where, to date, the surface has merely been
scratched. Constructive criticism and comment
from others will be most welcome.


REFERENCES 


(1) M. F. Bailey, B. E. Lanham, and J. Leibowitz,
    “Mechanized Searching in the U. S. Patent Office,”
    Journal of the Patent Office Society,
    Vol. 35, pp. 566-587.

    B. E. Lanham, J. Leibowitz, and H. R. Koller,
    “Advances in Mechanization of Patent Searches,”
    presented before the Division of Chemical
    Literature, 129th meeting of the American Chemical
    Society, Dallas, Tex., April 11, 1956.


(2) William N. Locke and A. Donald Booth,
    “Machine Translation of Languages,” John Wiley
    & Sons, 1955, pp. 167-173, inc.

(3) Classification Bulletin #402, U. S. Patent
    Office, 1951, containing the definitions of the
    classes and subclasses of class 53, Package
    Making.


James W. Perry, Allen Kent, and Madeline M.
Berry, “Machine Literature Searching,” 1956,
especially pages 84-89.


Since this paper went to press this new book has
been received. The close similarity between our
Modulants and the authors’ Analytic and Synthetic
Relationships is noted. The distinctions they draw
between their two types of relationships do not,
however, appear usable in the solution of the Patent
Office problem. E.g., an “insect acted upon by an
insecticide” (Analytic symbol W) is both a “starting
material” (Synthetic code KAJ) and a “material
processed” (Synthetic code KEJ) in Patent
Office reasoning.

